SRAM-Based PUF Readouts

Large-scale characterization of Physical Unclonable Functions (PUFs) is essential to assess their quality, and thus their suitability as an industrial-grade hardware root of trust. A proper characterization requires a large number of devices that are repeatedly sampled under various conditions, which makes the process very time-consuming and expensive. This work presents a dataset for the study of SRAM-based PUFs on microcontrollers; it includes full SRAM readouts together with readings from the internal voltage and temperature sensors of 84 STM32 microcontrollers. The data was gathered with a custom-made, open platform designed for the automatic acquisition of SRAM readouts from such devices. The platform also makes it possible to experiment with aging and reliability properties.

The platform is required to:
1. gather data from a large number of devices, in order to ensure statistically relevant results;
2. automatically power-cycle each device, since characterizing SRAM-based PUFs requires this time-consuming step;
3. guarantee data integrity, both in storage over time and in transmission without corruption;
4. be scalable, to allow for future development on other microcontrollers from various vendors.

This platform has been designed to work with any microcontroller, but the data has been collected from 84 STM32L152RE boards by ST Microelectronics. Moreover, it provides easy access to a comprehensive database containing thousands of samples from multiple boards, which is of high interest since hardly any datasets are publicly available. Additionally, this platform offers advantages over the ad-hoc solutions normally built for this purpose.
Our platform will save users considerable resources in terms of money (buying hundreds of devices) and time (collecting thousands of samples). The availability of raw data will enable any user to carry out various experiments (e.g. designing new post-processing, searching for systematic variations, etc.) with enough samples and devices for the results to be statistically significant. In addition, the extra information provided (operating conditions, wafer position, etc.) opens up a variety of options to detect vulnerabilities and develop new metrics. The platform also gives access to the real-time core voltage and temperature of the μC, measured by the sensors integrated on each μC. Operating conditions have proven to be of paramount importance for the stability of PUF responses, so the information provided by the on-chip sensors is necessary in any analysis.
As a feature that is, to the best of our knowledge, entirely new, we offer the possibility to interact with the boards by controlling the on/off switching time of the microcontrollers (data remanence studies) and by writing custom values to the SRAM (NBTI studies). Any value can be written to any SRAM region, which makes it possible to induce NBTI effects. It is well known that storing a certain value in an SRAM cell (e.g., 0) reinforces its tendency to power up to the opposite value (e.g., 1) due to aging mechanisms 8. NBTI, which stands for Negative Bias Temperature Instability, is a common phenomenon in PMOS transistors: it increases the threshold voltage and thus decreases the drain current and transconductance. Over time, the inverters in the SRAM cells may no longer behave as they initially did. Reliability being the capacity of a PUF to produce the same response under various conditions and over time, ref. 9 proposes to induce aging in the transistors in order to improve this reliability. For that purpose, the opposite of the value originally obtained is written into the cells.
Platform description. The platform comprises several components designed to work together in order to easily gather data from a large number of devices while ensuring its integrity. The code is built with a focus on scalability, so that different devices can be supported without altering the main mechanics of the station. The platform receives instructions from a message broker, which allows the necessary commands to be performed on the devices; data is stored in a SQL database; and every process is monitored and logged in real time to flag unexpected behaviour or errors. The station is depicted in Fig. 1.
Requests from the message broker are split into atomic "commands". Commands are queued, so that several can be submitted at the same time while still being executed sequentially, and each command has a timeout so that it cannot hang forever. Communication with the devices goes through a defined interface that translates commands into the packets of a custom-made communication protocol. This process makes […]

Communication protocol. A packet-based communication protocol has been designed to transfer data between the PC and the devices. Each packet contains the bytes to transmit as well as metadata. On the devices, the data is stored in a C struct, which is serialized when sending and deserialized when receiving. These packets are used to build atomic operations that carry out the intended actions. Table 1 presents the atomic operations that the platform can perform; they are the basis of more complex commands (Tables 1, 2).
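The serialize/deserialize round trip described above can be sketched in Python with the standard struct module. This is only an illustration: the field order, field widths, byte order and the numeric command code are assumptions made for the example, and the actual layout of the C struct used by the platform is not given in this text.

```python
import struct
from dataclasses import dataclass

# Hypothetical header layout (little-endian, no padding):
# command (1 byte), PIC (1 byte), options (2 bytes), UID (12 bytes), body length (2 bytes)
HEADER = struct.Struct("<BBH12sH")


@dataclass
class Packet:
    command: int      # numeric code of the atomic operation (codes are made up here)
    pic: int          # Position In Chain
    options: int      # e.g. the region offset of a READ
    uid: bytes        # 96-bit device identifier (12 bytes)
    body: bytes = b""

    def serialize(self) -> bytes:
        """Flatten the packet into bytes, as a device would do with its C struct."""
        header = HEADER.pack(self.command, self.pic, self.options, self.uid, len(self.body))
        return header + self.body

    @classmethod
    def deserialize(cls, raw: bytes) -> "Packet":
        """Rebuild a packet from received bytes, rejecting a truncated body."""
        command, pic, options, uid, length = HEADER.unpack_from(raw)
        body = raw[HEADER.size:HEADER.size + length]
        if len(body) != length:
            raise ValueError("truncated packet body")
        return cls(command, pic, options, uid, body)


READ = 2  # made-up command code, for illustration only
request = Packet(command=READ, pic=0, options=5, uid=bytes(12))
assert Packet.deserialize(request.serialize()) == request
```

Keeping the header at a fixed size means that a receiver can always parse it first and learn from it how many body bytes to expect.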
Custom code execution. As listed in Table 1, the platform provides commands for custom code execution. Code execution relies on zForth (https://github.com/zevv/zForth), a subset of Forth designed to be highly portable. Each device runs an instance of the zForth interpreter, which reads code from a buffer. As an example, the following code calculates the Fibonacci sequence from 1 to 1024.

    : fib 1 1 begin .. dup rot rot + dup 1024 > until ; fib

The same code can be executed multiple times without reloading it. The results are automatically written to a circular buffer from which they can later be retrieved. Users can exploit this functionality for their experiments: they make a request, which is queued, and once it has completed the data is sent to them directly.

Devices.
In order to maximize the number of devices connected to the PC, devices are linked in daisy chains, and each device uses two UART ports to communicate with the rest of the devices in both directions. All the devices are programmed with the same source code, which makes adding, removing and replacing equipment fairly trivial. The physical position of each device is detected through a field in our communication protocol called PIC, which stands for Position In Chain. The value starts at 0 at the PC and is incremented by 1 for every hop the packet makes downstream. This field, together with the 96-bit identifier provided by the manufacturer and stored in memory, fully determines the position of a device in the chain. The automatic power cycle of the devices is handled with a YKUSH USB hub (https://www.yepkit.com/products/ykush), which allows the power of three USB ports to be controlled independently.
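The way the PIC field evolves along a chain can be illustrated with a few lines of Python. This is a simplified model, not the device firmware: the function name and the placeholder identifiers are hypothetical, and the only behaviour taken from the text is that the field starts at 0 at the PC and grows by 1 per hop.

```python
def assign_positions(uids):
    """Return {uid: pic} for devices listed in chain order, nearest to the PC first."""
    positions = {}
    pic = 0  # value of the field when the packet leaves the PC
    for uid in uids:
        pic += 1  # each hop downstream increments the field by one
        positions[uid] = pic
    return positions


# Placeholder identifiers standing in for the 96-bit manufacturer IDs
chain = [f"device-{n:02d}" for n in range(84)]
positions = assign_positions(chain)
```

With 84 devices in a single chain, the first device ends up with a PIC of 1 and the last one with a PIC of 84, which matches the description of the pic field in Table 5.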

Monitoring.
To monitor the status of the platform at any given time, each packet contains a header with metadata about the command to execute and additional options. Because PUF quality parameters are very sensitive to bit changes, every packet carries a checksum, calculated as the CRC16 of the packet, to detect bit flips during data transmission. Additionally, since every operation the platform performs is atomic, any problems that may occur at any given moment can easily be located and reported. As an example, Table 2 portrays potential issues that could be detected while performing a WRITE operation on a device.

Command    Description
ACK        Acknowledgement of an operation. An ACK is sent by a device, for example after a WRITE operation, to inform the PC that the command has been executed successfully.
PING       Gives the PC the number of devices that are connected in a chain and their SRAM sizes.
READ       Read a memory region of a device. The Options field in the packet contains the offset to read from, expressed as the number of 512-byte regions to skip. If the region cannot be read or the checksum does not match, an ERR is transmitted back to the station.
WRITE      Write a series of bytes to a memory region of a device. The body of the packet contains the bytes to write. An ERR is sent back to the station if the checksum does not match.
INVERT     Uses the WRITE command to write the opposite values of the first sample.
SENSORS    Extract the sensor information from a device. Microcontrollers connected to the platform have temperature and core voltage sensors.
LOAD       Load source code onto a device to be executed later.
EXEC       Execute the code loaded by the LOAD command and store the results in a circular buffer from which they can later be retrieved.
RETR       Retrieve the results stored after the code has been executed.
ERR        Error during communication. It can be a wrong checksum or a problem during the parsing of a packet.
Table 1. Atomic operations that the platform can perform.

www.nature.com/scientificdata
Fail-safe mechanisms are introduced to detect more complex problems and immediately stop the platform in order to protect data integrity (e.g. a command could hang indefinitely waiting for lost bytes, or part of the devices could remain inaccessible due to a physical problem in the chain).
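The checksum mechanism can be sketched as follows. The text only states that a CRC16 is used, so the variant chosen here (CRC-16/CCITT-FALSE, polynomial 0x1021, initial value 0xFFFF) and the placement of the checksum as two big-endian bytes at the end of the packet are assumptions made for illustration; the helper names are hypothetical as well.

```python
def crc16_ccitt_false(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16/CCITT-FALSE (assumed variant; the source only says 'CRC16')."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def seal(payload: bytes) -> bytes:
    """Append the checksum to the payload (assumed position: last two bytes)."""
    return payload + crc16_ccitt_false(payload).to_bytes(2, "big")


def is_intact(packet: bytes) -> bool:
    """Recompute the checksum on reception; a mismatch would trigger an ERR."""
    if len(packet) < 2:
        return False
    payload, received = packet[:-2], int.from_bytes(packet[-2:], "big")
    return crc16_ccitt_false(payload) == received


packet = seal(bytes(range(32)))
corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]  # flip a single bit
assert is_intact(packet) and not is_intact(corrupted)
```

Any single bit flip is guaranteed to change the CRC, which is why a check of this kind is a natural fit for data whose quality metrics are sensitive to individual bits.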

Database.
A SQL database is used for reliable long-term storage and effortless data filtering. Communication with the database is designed to be vendor-agnostic, so any SQL database should work; for this project we have chosen PostgreSQL. The memory and sensor readouts are stored in two separate tables, represented in Tables 3, 4. Along with the sample information, a UTC timestamp is stored with each record to keep track of when the sample was extracted.

Data Records
A static version of the full dataset available at the time of writing can be downloaded from Zenodo https://doi.org/10.5281/zenodo.7529513 10. More data can be found online as explained in section 5 (Usage notes). The full dataset contains the SRAM readouts of 84 Nucleo microcontrollers of STM32 type along with their voltage and temperature sensor data. This dataset is composed of two CSV files:
• The first one houses the readouts of the SRAM memories. There are 9 samples per device and each memory is split into 160 regions of 512 bytes, summing up to a total of 120961 rows.
• The second file contains the temperature and voltage sensor readouts. It contains 11 readouts per device, 9 of which were performed along with the SRAM memory readouts. There are a total of 925 rows.
The fields of both files are detailed in Tables 5, 6:

Technical Validation
The data provided by this platform has not been pre-processed so as to keep raw data of the SRAM. Nevertheless, it is still important to ensure that the data provided does not present faulty bits and complies with the quality metrics depicted in the literature. Earlier, we have already detailed the different techniques of monitoring and error reporting used to guarantee the physical integrity of data Table 7.
The minimum time required to ensure that every device is correctly turned on or off is approximately 30 seconds. For every power cycle we have waited at least one minute, a comfortable margin to make sure that the SRAM contents are completely erased, given that an attacker could exploit data remanence to obtain PUF responses indirectly 11.
Moreover, the quality of the samples has been studied with the canonical quality metrics published by Maiti et al. 2. We provide a summary of these metrics in order to assess the performance of the SRAM on the μCs as a PUF. The purpose of these widely used metrics is to find vulnerabilities in the PUF behaviour, such as bias in the distribution of 1s and 0s, systematic variation, etc. These metrics are updated every 24 hours. Nevertheless, the user can select a specific device and obtain its metrics in real time. For the metrics that require reference values of the response bits (e.g. reliability), these values are derived by majority voting among samples. In the near future, we plan to include new metrics such as the ones presented in ref. 12.
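Majority voting among samples can be sketched in a few lines of Python. This is a minimal illustration rather than the code of the platform: the function name is hypothetical, and resolving ties to 0 when the number of samples is even is an assumption, since the text does not say how ties are handled.

```python
def majority_vote(samples):
    """Derive reference bits from several readouts of the same device.

    `samples` is a list of equally long bit lists, one per readout.
    Ties (possible only with an even number of samples) resolve to 0 here,
    which is an assumption of this sketch.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if len({len(s) for s in samples}) != 1:
        raise ValueError("all samples must have the same length")
    n = len(samples)
    return [1 if 2 * sum(bits) > n else 0 for bits in zip(*samples)]


readouts = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
reference = majority_vote(readouts)
```

Using an odd number of samples, such as the 9 samples per device of this dataset, avoids ties altogether.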
Before explaining the calculation of the metrics, Fig. 2 gives a visual aid on how the calculations are performed. Each row contains all the bits of one device, and the readouts are arranged into a 3D matrix whose third dimension gathers all the readouts of the devices.

Uniformity. It measures the randomness of each device. It is calculated as the average of all the response bits of a device, that is, the average across a row. The following formula is used to calculate the uniformity of each device, where C represents the number of bits in the SRAM. This value should be 0.5 to ensure an even distribution of 0s and 1s.
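Assuming the standard definition of Maiti et al., and writing r_{i,c} ∈ {0, 1} for bit c of device i in a given sample (a notation introduced here for illustration), the uniformity of device i can be written as:

```latex
\mathrm{Uniformity}_i = \frac{1}{C} \sum_{c=1}^{C} r_{i,c}
```

For instance, a device whose 4-bit response is 1010 has a uniformity of 2/4 = 0.5.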
Bit-aliasing. It assesses the randomness of each challenge across devices. It is calculated in a similar manner to uniformity, but on columns instead of rows. The following formula is used for each challenge, where D represents the number of devices that are studied. As with uniformity, this value should be 0.5.
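Under the same assumption of the standard definition of Maiti et al., with r_{i,c} ∈ {0, 1} denoting bit c of device i in a given sample (a notation introduced here for illustration), the bit-aliasing of challenge c can be written as:

```latex
\mathrm{BitAliasing}_c = \frac{1}{D} \sum_{i=1}^{D} r_{i,c}
```

A value close to 0 or 1 means that the corresponding cell powers up to the same value on almost every device, so it carries little device-specific information.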
Uniqueness. This metric measures the difference between the responses of pairs of devices, which should reflect enough randomness across devices. The HD norm function refers to the normalized Hamming distance and C to all the responses from a device in a given sample. The ideal value should be close to 0.5 to ensure that each device produces a unique set of challenge-response pairs (CRPs).
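Assuming the standard definition of Maiti et al., and writing R_i for the vector of the C response bits r_{i,c} of device i in a given sample (a notation introduced here for illustration), the uniqueness over D devices can be written as the average normalized Hamming distance over all pairs of devices:

```latex
\mathrm{Uniqueness} = \frac{2}{D(D-1)} \sum_{i=1}^{D-1} \sum_{j=i+1}^{D} \mathrm{HD}_{\mathrm{norm}}(R_i, R_j),
\qquad
\mathrm{HD}_{\mathrm{norm}}(R_i, R_j) = \frac{1}{C} \sum_{c=1}^{C} \left( r_{i,c} \oplus r_{j,c} \right)
```

The factor 2 / (D(D-1)) is the inverse of the number of distinct device pairs.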
Field         Description
board_type    Device model that is connected to the chain. This dataset only contains Nucleo.
uid           STM32 96-bit ID formatted as a 24-character hexadecimal string. This UID includes the batch number of the device, its wafer number, wafer lot and the X and Y position of the μC on the wafer.
pic           The Position In Chain. Devices are assembled in a chain-like architecture to maximize their connections to the computer that performs the readout. The first device has a PIC of 1 and the last one has a PIC of 84.
address       The address of each memory region where the data was read. Each memory region contains exactly 512 bytes. For Nucleo devices, there are 160 regions in total. Their SRAM memory starts at address 0x20000000.
data          512 bytes of the memory region as unsigned integers. The values are separated with commas and surrounded with double quotation marks.
created_at    Timestamp of when the data was gathered, in ISO format (YYYY-MM-DD hh:mm:ss).
Table 5. CSV file housing SRAM memory readouts.

Field          Description
board_type     Identical to description in Table 5.
uid            Identical to description in Table 5.
pic            Identical to description in Table 5.
temperature    Temperature of the device in Celsius.
voltage        Internal voltage of the device in Volts.
created_at     Identical to description in Table 5.
Table 6. CSV file housing temperature and voltage sensors.
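As an illustration of how the SRAM readout file described in Table 5 could be consumed, the following Python sketch parses one row and expands its data field into bits. Several details are assumptions: the presence of a header line with the field names, the column order, the textual format of the address, and the most-significant-bit-first ordering within each byte. The sample row is made up and shortened to 4 bytes instead of 512.

```python
import csv
import io

# Made-up sample with a shortened data field (4 bytes instead of 512)
SAMPLE = '''board_type,uid,pic,address,data,created_at
NUCLEO,0123456789ABCDEF01234567,1,0x20000000,"0,255,1,128",2023-01-01 00:00:00
'''


def region_bits(row):
    """Expand the comma-separated byte values of one memory region into bits."""
    values = [int(v) for v in row["data"].split(",")]
    if any(not 0 <= v <= 255 for v in values):
        raise ValueError("byte value out of range")
    # Assumed bit order: most significant bit of each byte first
    return [int(bit) for v in values for bit in format(v, "08b")]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
bits = region_bits(rows[0])
ones_ratio = sum(bits) / len(bits)  # fraction of cells that powered up to 1
```

Because the data field is quoted, the csv module keeps its inner commas together as one column; splitting lines on commas by hand would break the row apart.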

Name         Possible values
Date         4-digit year
BoardId      "all" or "0x" + 24-character UID + "-" + pic
CircuitId    "all" or "NUCLEO" or "DISCOVERY"

Reliability. It determines the variability of responses over time and under different conditions. The reliability of each device is calculated by grouping all the responses of a sample into a vector and computing the Hamming distance to all the other samples. The ideal value is 1, which indicates that responses do not change over time. It is important to mention that at least two samples are needed to calculate reliability, as the Hamming distance of a sample with itself is 0.
\[ \mathrm{Reliability}_i = 1 - \frac{1}{S} \sum_{s=1}^{S} \mathrm{HD}_{\mathrm{norm}}\left(R_{i,\mathrm{ref}}, R_{i,s}\right) \]

where S is the number of samples, R_{i,s} is the response of device i in sample s and R_{i,ref} is its reference response.

These quality metrics are measured whenever a new readout is performed and are monitored through Grafana (https://grafana.com/) to assess potential problems during the readout. Figure 3 displays two snapshots of the Grafana dashboard checking the quality of the samples. Figure 4 shows a heat map of a memory sample of the 84 devices connected to the platform. Each row corresponds to a device, and colors indicate the value of each memory cell. The areas at the beginning and at the end of each SRAM are full of 0s, as they are used for the stack to load the executable code and buffers. To ensure the proper operation of the devices, these areas are read but not written. This heat map can be seen as one of the 2D matrices used to calculate the metrics.

[…] chain; therefore, the memory from all 84 devices would be retrieved in about 24 hours. In case of error during communication, a packet notifying the faulty bits is sent back to the PC and the information is retransmitted, with a consequent increase in transmission delay.

The other limitation is that controlling voltage and temperature is not feasible at the moment; the devices are simply exposed to ambient conditions.
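The four quality metrics described in this section can be sketched in plain Python on a toy example. This is an illustration under the usual definitions of Maiti et al., not the code run by the platform: the function names are hypothetical, and the toy responses are 4 bits long instead of a full SRAM.

```python
from itertools import combinations


def hd_norm(a, b):
    """Normalized Hamming distance between two equally long bit vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def uniformity(response):
    """Fraction of 1s in the response of one device (average across a row)."""
    return sum(response) / len(response)


def bit_aliasing(responses):
    """Fraction of 1s of each bit position across devices (average per column)."""
    return [sum(column) / len(responses) for column in zip(*responses)]


def uniqueness(responses):
    """Average normalized Hamming distance over all pairs of devices."""
    pairs = list(combinations(responses, 2))
    return sum(hd_norm(a, b) for a, b in pairs) / len(pairs)


def reliability(reference, samples):
    """One minus the average distance between the reference and each sample."""
    return 1 - sum(hd_norm(reference, s) for s in samples) / len(samples)


# One sample of three toy devices (rows) with four bits each (columns)
devices = [
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]
per_device = [uniformity(d) for d in devices]
per_bit = bit_aliasing(devices)
across_devices = uniqueness(devices)
# Two samples of the first device compared with its reference response
stability = reliability([1, 0, 1, 0], [[1, 0, 1, 0], [1, 0, 1, 1]])
```

Uniformity and bit-aliasing are the same average taken along different axes of the device-by-bit matrix, which is why the two metrics share the ideal value of 0.5.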

Usage Notes
Although the dataset described in this document is static, more data can be requested online through the application hosted at https://puf4iot.univ-grenoble-alpes.fr/form.php. Any user can submit a request to the server specifying the data they want to retrieve, and a CSV file (or a zip file for larger records) will be returned.
The database with raw data is updated once a week. The SRAM start-up values have been organized in memory regions of 512 bytes, which provides a good trade-off between usability and data integrity. Each sample has been time-stamped in order to record its collection date. Figure 5 shows the entry point of the website. Users provide their information and are given access to the main site, where filters guide them in requesting the data they need. The website then fetches the required data and generates the CSV or zip file.
Data can also be requested by making an HTTP request to the server with the filter parameters. The curl or wget commands can be used to get data directly from a terminal.
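A request URL carrying such filter parameters could be assembled as in the following Python sketch. It is purely illustrative: the base URL is a placeholder and not the address of the real service, and using the names Date, BoardId and CircuitId as query-string keys is an assumption based on the parameter names listed in this document. The helper function is hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Placeholder endpoint, NOT the real server path (which is not given here)
BASE_URL = "https://example.org/sram-data"


def build_request_url(year, uid=None, pic=None, circuit="all"):
    """Build a filter URL; key names are assumed to match the parameter names."""
    if uid is None:
        board_id = "all"
    else:
        if len(uid) != 24:
            raise ValueError("the UID is expected to be 24 hexadecimal characters")
        board_id = f"0x{uid}-{pic}"
    params = {"Date": year, "BoardId": board_id, "CircuitId": circuit}
    return f"{BASE_URL}?{urlencode(params)}"


# Made-up UID, for illustration only
url = build_request_url(2023, uid="0123456789ABCDEF01234567", pic=1, circuit="NUCLEO")
```

The resulting string can be passed to curl or wget, or to urllib.request.urlopen from a script, once the real endpoint is known.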

Code availability
The source code of the platform and of the STM32 devices is available under the GPL-2.0 license at https://github.com/servinagrero/SRAMPlatform. Online documentation on the platform and guidance on setting up a custom station can be found at https://servinagrero.github.io/SRAMPlatform. The full list of Python dependencies is available in the pyproject.toml file in the GitHub repository. PostgreSQL (https://www.postgresql.org/) is needed to store data, a message broker is necessary to communicate with the station (RabbitMQ, https://www.rabbitmq.com/, in our case), and Grafana is used to monitor metrics and sensors in a dashboard.